-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[tests][R-package] run all tests using 2 threads (fixes #5102) #5367
Conversation
@jameslamb trying the suggestion from #5102 (comment) I found that we needed to set the parameters in many more places, and for example |
@jmoralez ah ok, the point about But to be honest, I don't understand how the new approach using How exactly are you testing that this change restricts LightGBM to using 2 threads? Seeing how much effort this is taking, and how many different places have to be modified, I think we might want to consider closing this and instead focusing on #4705, then using the solution to #4705 to satisfy #5102. |
I was relying on the OMP to check for the env variable, this time I'm manually reading it.
It was due to how OMP works, it just reads the env variable at program startup, not at runtime.
Looking at the htop while running
I think that #4705 has two possible solutions:
In either case I think it wouldn't help with #5102 because for that we want to globally set the number of threads to a number or set an upper bound for them. And in each of them we could use this approach because if we do 1. we can leave the proposed code here as is and it would work, and in 2. we could modify this to set the default number of threads for lightgbm instead of OMP. |
ok @jmoralez now that we got v3.3.3 out, I'm ready to come back and help move this forward. I'm sorry, but I don't support expanding LightGBM's API this way, or adding another layer of complexity to how LightGBM chooses how many threads to use. We can say "this environment variable is internal only", but we can't actually enforce that, and this PR would add the first use of I don't think that's worth it for the narrow purpose of "more closely follow CRAN's suggested guidelines, which they've never complained about before". Given that CRAN has never rejected a submission for I recommend that we just do what #5102 initially proposed in its description, and follow the same approach that was used for verbosity in #4862. Thank you again for investigating these other different options in this PR! But I think we should just move forward with something that is simple and known to work for this case, even though it feels inelegant and might miss a few cases. |
@jmoralez I saw you closed this. Are you planning to open another PR implementing the approach I suggested? If not, would you support one if I did? Totally understand if you feel like you don't want to put any more effort into this particular issue. |
I'd prefer to work on something else, I support it if you want to do it though. I'm not sure how CRAN measures the CPU usage of a package tests so as you say it probably doesn't have to be perfect. Because even if we set the threads for all trainings, every predict will use all of them and also some of the dataset constructions (#4598, #5153). |
Ok thanks @jmoralez ! And great points about I will pick this up at some point (in a new PR). |
This pull request has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this. |
Adds the option to set the environment variable
LGBM_NUM_THREADS
to override all other sources that setomp_num_threads
. This allows to globally set the number of threads for all operations from LightGBM, which will be used to limit the number of threads to 2 for the R-package tests. This is meant only for internal use so it isn't documented but we could if someone thinks it would be useful.Fixes #5102.